Identification of molecular targets and small drug candidates for Huntington's disease via bioinformatics and a network‐based screening approach

Abstract Huntington's disease (HD) is a progressive neurodegenerative disorder characterised by an expansion of a specific trinucleotide repeat sequence (cytosine–adenine–guanine, CAG). It is inherited as a dominant trait and worsens over time, posing a significant risk. Despite being monogenic, its underlying mechanisms and biomarkers remain poorly understood. Furthermore, early detection of HD is challenging, and the available diagnostic procedures have low precision and accuracy. This research was conducted to provide knowledge of the biomarkers, pathways and therapeutic targets involved in the molecular processes of HD using bioinformatics analysis and network‐based systems biology approaches. The gene expression profile datasets GSE97100 and GSE74201 relevant to HD were studied. As a result, 46 differentially expressed genes (DEGs) were identified. Ten hub genes (TPM1, EIF2S3, CCN2, ACTN1, ACTG2, CCN1, CSRP1, EIF1AX, BEX2 and TCEAL5) were further identified in the protein–protein interaction (PPI) network. These hub genes were typically down‐regulated. Additionally, DEG–transcription factor (TF) connections (e.g. GATA2, YY1 and FOXC1) and DEG–microRNA (miRNA) interactions (e.g. hsa‐miR‐124‐3p and hsa‐miR‐26b‐5p) were comprehensively predicted. Related gene ontology terms (e.g. sequence‐specific DNA binding and TF activity) connected to DEGs in HD were identified using gene set enrichment analysis (GSEA). Finally, in silico drug design was employed to find candidate drugs for the treatment of HD, and possible therapeutic compounds (e.g. cortistatin A, 13,16‐Epoxy‐25‐hydroxy‐17‐cheilanthen‐19,25‐olide and hecogenin) against HD were predicted. Consequently, the results from this study may give researchers useful resources for the experimental validation of Huntington's diagnosis and therapeutic approaches.


| INTRODUCTION
Huntington's disease (HD) is a degenerative brain disorder caused by an expansion of CAG (cytosine-adenine-guanine) repeats in the IT15 gene, also known as the huntingtin (HTT) gene. 1,2 It is transmitted as a dominant trait and is fully penetrant. Translation of the expanded HTT gene creates a very large polyglutamine domain near the N-terminus of the huntingtin protein. 3 Due to the expansion of the CAG region, the mutated huntingtin protein (mHTT) becomes highly unstable. This instability leads to the aggregation of proteins of the same or different kinds and potential disruption of neurotransmission. 4,5 A complicated interplay between hereditary and environmental variables leads to HD, with subsequent generations enduring more dire societal circumstances. [8] Prevalent signs and symptoms include loss of fine motor function, anomalies in the cerebellum, gait abnormalities, dysarthria, cognitive problems and stiffness. 9 HD causes significant physical and cognitive impairments, including memory loss, depression, mood swings, clumsiness and several other psychological difficulties and diseases. 4,10 A clinical diagnosis of HD is made when there is strong evidence of a motor disorder, namely chorea, together with iatrogenic illnesses and general internal abnormalities. While therapies exist for symptomatic relief, there is currently no cure for this debilitating brain disorder. 10,11 Currently, no drugs are available to prevent symptoms or disease progression; nevertheless, various successful treatments (i.e. pharmaceutical and nonpharmacological therapies) are available. 12 Furthermore, HD prevalence varies globally, with some locations seeing elevated rates. Genetic testing has led to an increase in the detected occurrence of the illness in several groups. 4][15][16][17] Since 1995, there has been a clear disparity in prevalence rates between Asian and white populations, with the latter exhibiting greater rates.
16,18,19 Several studies have suggested histone alterations, protein hubs, transcription factor (TF) dysregulation and aberrant microRNA (miRNA) levels as possible indicators for diagnosing HD. Diagnosing HD in its early stages is difficult due to limits in accuracy, precision and cost. 20,21 Furthermore, disagreements among researchers have arisen about the interpretation of differentially expressed genes (DEGs). Utilising brain cell analysis for early HD identification and therapy appears promising. The faulty gene responsible for HD was found in 1993, allowing genetic testing for diagnosis. 22 The genetic test for HD can identify the faulty HTT gene, revealing genetic defects in persons without symptoms who are at risk of developing the condition later in life. 22,23 Expansion of CAG triplet repeats in the HTT gene causes the production of pathogenic HTT protein residues that are resistant to regular cellular functions, which is the cause of HD. 22,[24][25][26] Most HD patients first suffer motor difficulties, whereas a small minority (around 15%) display psychological symptoms before motor abnormalities emerge. 10,27 The most common manifestation of HD is a specific degeneration of the brain's corpus striatum with no appreciable abnormalities found in the peripheral tissues. 10,28 Protein hubs, TF deficits, histone changes and aberrant miRNA expression are implicated in HD, and the development of disease-specific biomarkers is vital for evaluating HD treatments. 29,30 Various studies have focused on discovering molecular biomarkers utilizing brain and blood data. Investigating molecular biomarkers indicating brain expression patterns relevant to HD development is critical.
21,22,24,31 The objective of this work is to identify molecular biomarkers that indicate the alignment of brain expression patterns with associative factors linked to HD progression. An investigation of DEGs in HD patients was undertaken using data from the GEO database (GSE97100 and GSE74201). An evaluation of probable processes and important genes for HD was carried out, including over-representation analysis of KEGG pathways, protein-protein interaction (PPI) network analysis and Gene Ontology (GO) analysis. Overall, our investigation included cross-validating the suggested candidate drugs with the latest alternative target proteins utilizing molecular docking as well as molecular dynamics simulation approaches to investigate the probable processes of HD.

| Data collection
Datasets for the study were obtained from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) repository. 32 Homo sapiens data relevant to HD were searched, resulting in 786 datasets. The majority were initially excluded for reasons such as being noncoding, having limited samples, being repetitive, having inadequate formats, improper experimental settings, missing control participants, or coming from non-human organisms. 19,30 Ultimately, after careful consideration, two datasets were selected for analysis: GSE97100, containing mRNA-seq data from HD and control patient induced pluripotent stem cell-derived brain microvascular endothelial cells, and GSE74201, a genomic analysis that highlights disruptions in striatal neuronal development and potential treatment areas in a human neural stem cell model of HD.

| Identification of DEG and common gene
Gene Set Enrichment Analysis helps in discovering DEGs from a huge gene pool related to illness symptoms by applying various statistical approaches. 19,33 Figure 1 illustrates the stages involved in data collection. Notably, GSE97100 and GSE74201 had 46 genes in common, all demonstrating a down-regulation trend.
DEGs were identified based on criteria such as p value <0.05 and the absolute value of the log2 fold change (FC), using linear models with tools like limma from Bioconductor in R, GEO2R and GREIN. 34,35 p-Value adjustment was conducted employing the Benjamini-Hochberg method, which controls the false discovery rate (FDR). The FDR, $Q_e$, is the expected value of $Q$, 36

$$Q_e = E[Q]$$

and it implies that

$$Q = \frac{V}{V + S} = \frac{V}{R}$$

The random variable $Q$ denotes the proportion of errors resulting from null hypotheses that are erroneously rejected, where V = number of significant true null hypotheses, S = number of significant nontrue null hypotheses and R = the number of hypotheses that are rejected.
Overlap analysis of the DEGs was conducted using a Venn diagram to identify common genes.
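The Benjamini-Hochberg adjustment described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `benjamini_hochberg` and the raw p-values are hypothetical and are not taken from the study, which performed the adjustment with limma, GEO2R and GREIN.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five genes (illustration only).
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

With these toy values only the first two genes remain significant after adjustment, although four of the five raw p-values are below 0.05.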

| Enrichment analysis of gene sets
The Enrichr bioinformatics application was used to annotate the KEGG pathways, Reactome pathways and gene ontology terms of the DEGs. 19,379][40] The Reactome and KEGG pathways play a significant role in understanding cellular metabolism. 30,41

| Detection of protein-protein interactions
Protein-protein interactions (PPIs) are crucial for biological activities, influencing properties such as the shape, affinity and permanency of contact. 34,42,43 A PPI network was constructed from the STRING database with a confidence score cut-off of 500 (adjusted p-value less than 0.05) to identify protein hub interactions. 44 Cytoscape software (v.3.10.1), 45 with the MCODE and cytoHubba applications, 44 was used to find hub genes.
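The idea of filtering interactions by a confidence cut-off and then ranking proteins by connectivity can be sketched as follows. This is a minimal illustration, not the procedure used by cytoHubba or MCODE: ranking by node degree is only one possible scheme, and the source does not state which ranking method was chosen. The gene names come from the study, but every edge and score in `toy_edges` is invented for illustration.

```python
from collections import Counter


def rank_hubs(edges, min_score=500, top_n=3):
    """Rank proteins by degree, keeping only edges at or above min_score."""
    degree = Counter()
    for a, b, score in edges:
        if score >= min_score and a != b:
            degree[a] += 1
            degree[b] += 1
    # Sort by descending degree, then alphabetically for a stable order.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical (protein, protein, combined score) triples; not real STRING data.
toy_edges = [
    ("TPM1", "ACTN1", 900),
    ("TPM1", "ACTG2", 850),
    ("TPM1", "CSRP1", 620),
    ("ACTN1", "ACTG2", 700),
    ("EIF2S3", "EIF1AX", 950),
    ("CCN1", "CCN2", 480),  # falls below the 500 cut-off and is ignored
]
hubs = rank_hubs(toy_edges)
```

In this toy network TPM1 has the highest degree, so it would be reported as the top hub.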

| Identification of transcriptomic regulators
Regulatory molecules such as TFs and miRNAs were considered potential regulators affecting DEG expression and were analysed using Network Analyst. 46,47 miRNAs hold promise in gene regulation and represent a potential therapeutic target class with developmental roles, impacting diverse physiological activities over time. 19,48

| Protein selection and preparation
Out of the proteins involved in HD formation, we chose one protein that is expressed at dysregulated levels for analysis using computer simulations. The protein of interest in this investigation is P29279-CCN2_HUMAN (Protein Accession: P29279), whose structure was downloaded from UniProt. 49 The structure file downloaded from UniProt was used directly for our molecular docking analysis.

| Ligand library preparation
HOSSAIN et al.

| Molecular docking
Molecular docking of the selected marine-derived compounds against P29279-CCN2_HUMAN (Protein Accession: P29279) was performed with AutoDock Vina using PyRx software, which is openly available and built for molecular docking research. PyRx provides a docking wizard with an easy-to-use user interface that makes it a valuable tool for computer-aided drug design. PyRx further contains chemical spreadsheet-like functionality and a sophisticated visualization system, both of which are required for rational drug development. 50,51 The identified pharmacological targets were energy minimized and converted into pdbqt format via PyRx.

| ADMET
Absorption, distribution, metabolism and excretion (ADME), as well as drug-likeness, are vital aspects in assessing which potential medicinal compounds can advance to clinical trials. The three compounds that displayed the best binding energy with P29279-CCN2_HUMAN (Protein Accession: P29279) were then submitted to the SwissADME and pkCSM web servers 52,53 to predict their pharmacological profiles and measure drug-likeness; these characteristics were evaluated based on Lipinski's Rule of Five. 54 SwissADME is a publicly available web-based application offering robust and fast models for computing drug-likeness, pharmacokinetic properties and medicinal chemistry friendliness of phytochemicals. 52[57]
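Lipinski's Rule of Five can be expressed as a simple count of violations. The sketch below is illustrative only: the `Descriptors` class, the one-violation tolerance and the two example molecules are assumptions, and the descriptor values are invented rather than taken from SwissADME or pkCSM output for the studied compounds.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float       # molecular weight in Da
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Rule of Five criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    # Assumption: at most one violation is tolerated, a commonly used reading.
    return lipinski_violations(d) <= max_violations


# Hypothetical descriptor sets (illustration only).
drug_like = Descriptors(mol_weight=430.6, log_p=4.1, h_bond_donors=1, h_bond_acceptors=4)
too_large = Descriptors(mol_weight=712.9, log_p=6.3, h_bond_donors=6, h_bond_acceptors=12)
```

Setting `max_violations=0` gives the stricter reading in which any violation disqualifies a compound.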

| Post-docking data visualization
The top binding poses, determined by the lowest binding free energy, were selected for the three investigated ligands. The interactions between protein and ligands were further illustrated using Biovia Discovery Studio (https://discover.3ds.com/discovery-studio-visualizer-download) 14,58 and LigPlot+ (https://www.ebi.ac.uk/thornton-srv/software/LigPlus/download2.html). 59 A detailed investigation of every protein-ligand pair was done to identify hydrogen bonds (H-bonds), interacting amino acids and the particular atoms connected with every ligand. 56

| Molecular dynamics simulation
The stability of a protein-ligand complex, where greater stability suggests less dissociation and greater association, may be measured using computational approaches. To test the stability of a ligand and its complex with the receptor, a molecular dynamics (MD) simulation was performed using the Maestro package within the Schrödinger platform (paid version) on a Linux OS. The thermodynamic stability was tested utilising the OPLS3e force field with a TIP3P water model. 14,60 The system was built with orthorhombic periodic boundary conditions at a distance of 20 Å. Sodium and chloride ions were introduced for electrical neutralization utilising the OPLS3e force field. MD simulations were performed under constant temperature and pressure conditions (NPT ensemble) 61 at 300 K and 1 atm utilising Nose-Hoover temperature coupling and isotropic scaling. 62,63 A 100 ns simulation was run, saving configurations every 100 ps. Stability was examined using metrics such as radius of gyration (rGyr), molecular surface area (MoLSA), root mean square fluctuation (RMSF), root mean square deviation (RMSD), solvent accessible surface area (SASA) and intermolecular bonds.

| Identifying DEGs
Initially, we applied the robust multi-array average (RMA) approach to normalize the gene expression data. Subsequently, we applied the linear models for microarray data (LIMMA) statistical approach to examine the normalized dataset. This analysis revealed 1773 genes that exhibit differential expression (DEGs) between control and case samples, with significance determined by an adjusted p-value of 0.05 and a minimum absolute log2 fold change threshold of 1, as depicted in Figure 2. Among these DEGs, 139 were up-regulated, whereas 1544 were down-regulated. Further study was performed based on these DEGs. The transcriptome data for HD identified 46 common DEGs, all of them down-regulated genes from the overlapping DEGs (see Table 1 and Figure 2). Interactive heat maps were created for both the control and case samples using the given datasets, as presented in Figure 3E,F.
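The thresholding and overlap steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper `split_degs`, the placeholder gene `GENE_X` and all log2 fold changes and adjusted p-values are hypothetical and do not reproduce the GSE97100 or GSE74201 results.

```python
def split_degs(results, p_cutoff=0.05, lfc_cutoff=1.0):
    """Split genes into up- and down-regulated sets.

    `results` maps gene symbol -> (log2 fold change, adjusted p-value).
    """
    up, down = set(), set()
    for gene, (log2_fc, adj_p) in results.items():
        if adj_p < p_cutoff and abs(log2_fc) >= lfc_cutoff:
            (up if log2_fc > 0 else down).add(gene)
    return up, down


# Hypothetical statistics for two datasets (illustration only).
dataset_a = {
    "TPM1": (-2.1, 0.001),
    "CCN2": (-1.4, 0.010),
    "BEX2": (-0.6, 0.020),   # fold change too small
    "GENE_X": (1.8, 0.003),  # placeholder gene name
}
dataset_b = {
    "TPM1": (-1.7, 0.004),
    "CCN2": (-1.2, 0.030),
    "BEX2": (-1.5, 0.200),   # not significant
    "GENE_X": (0.4, 0.500),
}

up_a, down_a = split_degs(dataset_a)
up_b, down_b = split_degs(dataset_b)
# The Venn-diagram overlap corresponds to a set intersection.
common_down = down_a & down_b
```

The intersection plays the role of the overlapping region of the Venn diagram; in this toy case it contains TPM1 and CCN2.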

| Detection of gene ontology and pathway enrichment
Enrichr was utilised to conduct GO term and pathway enrichment analyses, aiming to unveil the biological significance of the DEGs and identify enriched pathways linked to our research. GO analysis was applied to understand the biological processes, cellular components and molecular functions of the DEGs. The top 10 terms within the categories of molecular function, biological process and cellular component are presented in Table 2. Additionally, to determine the biological pathways enriched by the shared DEGs, functional enrichment analysis was carried out. Through the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway study, important pathways such as axon guidance, calcium signalling pathway and oxidative phosphorylation were found. In terms of Reactome pathways, enrichment was identified in pathways such as Translation Initiation Complex Formation and SUMOylation Of Ubiquitinylation Proteins. Figure 4A,B exhibit the KEGG pathways and highly enriched Reactome pathways, respectively.
This study identified significant differentially expressed genes (DEGs) associated with HD and analysed related KEGG and Reactome pathways. Using a PPI network and transcriptomic analysis, potential drug targets were identified from the HD expression dataset. Additionally, potential drug candidates targeting these molecules were also discovered.
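Over-representation of a pathway among a gene list is commonly assessed with a hypergeometric (one-sided Fisher) test. The sketch below illustrates that general idea only; Enrichr's own scoring may differ in detail, and the function name and the universe size, pathway size and overlap in the example are assumptions. Only the list size of 46 DEGs comes from the study.

```python
from math import comb


def hypergeom_enrichment_p(universe, pathway_size, n_degs, overlap):
    """P(X >= overlap) when n_degs genes are drawn without replacement
    from a universe containing pathway_size pathway members."""
    upper = min(pathway_size, n_degs)
    total = comb(universe, n_degs)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, n_degs - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 46 DEGs, 4 of which fall in a 180-gene pathway,
# against an assumed background of 20000 genes.
p_value = hypergeom_enrichment_p(universe=20000, pathway_size=180, n_degs=46, overlap=4)
```

In practice the resulting p-values would still need multiple-testing correction across all pathways tested.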

| Transcriptional regulator prediction
To investigate important changes occurring at transcriptional and post-transcriptional levels, we evaluated DEGs to find TFs and post-transcriptional regulatory molecules (miRNAs) according to their degree values. By identifying TFs and miRNAs targeting DEGs, we aimed to predict regulatory molecules affecting gene expression at these levels.

| Molecular docking analysis
Molecular docking is an approach that analyses the ligand's optimum binding position within the active region of a target. 64 Using this approach, the 3D coordinate space of the binding region is located in the target, and the binding affinity is computed to determine the orientation of the molecule inside the binding site that forms the complex. 64 The most negative value (that is, the lowest binding energy and hence the greatest binding affinity) indicates the most favourable complex configuration created when the ligand binds to the target's active site, and this value is used to evaluate the importance and sensitivity of binding affinity data. 65 Furthermore, molecular docking was carried out to confirm the effectiveness against HD of the marine-derived compounds obtained from the literature study by assessing the binding modes and orientations of the ligands in the receptor pocket of the P29279-CCN2_HUMAN (Protein Accession: P29279) target. The resulting docking result is displayed in Table 4. It was found that the binding affinity of compounds var-  4).
FIGURE 2 Identification of commonly expressed genes that are upregulated (A) and downregulated (B).

| ADMET analysis
Measuring physiological and ADMET characteristics by implementing computational methods is a quick, effective and precise technique. 66 We explored the physiological and ADMET characteristics It is essential to point out that all compounds displayed drug-like characteristics and followed Lipinski's rule. 67 The physicochemical and pharmacological properties of the selected compounds are displayed in Table 3.

| MDS analysis
Molecular dynamics simulation (MDS) assesses the structural stability of atoms and molecules by representing their systems at the atomic level. MD simulation also stands out as a method for evaluating the stability of a ligand in association with a particular protein macromolecule. In this instance, an MD simulation lasting 100 ns was conducted to determine the stability of the protein-ligand complexes and to evaluate the ability of the ligands to bind efficiently to the protein, particularly at its active site region. The results of the MD simulation are reported in terms of SASA, protein-ligand contact analysis (P-L contact), MoLSA, RMSF, RMSD and intramolecular hydrogen bonds (Intra HB).

| Root mean square deviation (RMSD) analysis
The root mean square deviation (RMSD) of a protein-ligand complex system measures the average distance that atoms move from their reference positions over a defined period. It is computed as the square root of the mean of the squared deviations between the observed and reference coordinates. Average changes across frames within the range of 1-5 Å (0.1-0.5 nm) are acceptable, but a value surpassing this range suggests a significant conformational change in the protein. 57 The combined RMSD analysis of the complex structures involving the drug candidates CID-11561907 (depicted in green), CID-10621161 (in yellow) and CID-91453 (in orange) with the protein P29279-CCN2_HUMAN (Protein Accession: P29279) was conducted to observe changes in their structural order, as illustrated in Figure 9. The RMSD values for the three compounds fell within the range of 3.6 Å to 4.8 Å, showing slight fluctuations between 4 and 28 ns that were entirely within acceptable limits.
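The RMSD between a reference structure and a trajectory frame can be computed directly from atomic coordinates. The following sketch assumes that both coordinate sets list the same atoms in the same order and that the frame has already been superimposed on the reference; it is not the Schrödinger implementation, and the coordinates are invented.

```python
import math


def rmsd(reference, frame):
    """Root mean square deviation between two equally sized lists of (x, y, z)."""
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(reference, frame):
        total += (x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2
    return math.sqrt(total / len(reference))


# Hypothetical two-atom example, coordinates in Å (illustration only).
ref_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_coords = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
value = rmsd(ref_coords, frame_coords)
```

Here the two atoms are displaced by 3 Å and 4 Å, so the RMSD is the square root of (9 + 16) / 2.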

| Root mean square fluctuation (RMSF) analysis
The root mean square fluctuation (RMSF) may aid in detecting and determining the regional modifications that occur within the protein chain whenever a specific ligand interacts with certain protein residues. 57 Consequently, the RMSF values for compounds CID:

| Radius of gyration (Rg) analysis
The radius of gyration (Rg) describes the distribution of atoms around the axis of the protein-ligand complex. Rg has become one of the most significant indicators of macromolecular structural stability because it reflects changes in the compactness of the complex. Therefore, the compactness of the selected compounds CID:11561907, CID:10621161 and CID:91453 in complex with the P29279-CCN2_HUMAN (Protein Accession: P29279) protein was also studied over the 100 ns simulation time, as shown in Figure 11.
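A mass-weighted radius of gyration can be computed from atomic coordinates and masses as the root of the mass-weighted mean squared distance from the centre of mass. The sketch below is a generic illustration under that assumption; it is not claimed to match the rGyr routine of the simulation package, and the coordinates and masses are hypothetical.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for a list of (x, y, z) coordinates."""
    total_mass = sum(masses)
    # Centre of mass, one component at a time.
    com = tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    )
    weighted_sq = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


# Hypothetical example: two equal masses placed 2 Å apart (illustration only).
toy_coords = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_masses = [12.0, 12.0]
rg = radius_of_gyration(toy_coords, toy_masses)
```

A smaller value over the course of a trajectory indicates a more compact complex, which is how the Rg curves are interpreted.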

| Solvent accessible surface area
Solvent-accessible surface area (SASA) is a measure used to assess the polar and non-polar surface area of molecules, helping to understand how residues interact with the solvent. 68 According to the findings shown in Figure 12

| Intramolecular bond
The protein-ligand interaction, notably involving hydrogen bonds, hydrophobic interactions and water bridges, greatly impacts drug selectivity, metabolism and absorption. Hence, the intermolecular interactions of the protein-ligand complexes were investigated using simulation interaction diagrams (SIDs) throughout a 100 ns simulation period. The interaction fraction values were determined for the protein and ligands, including compounds CID:10621161, CID:91453 and CID:11561907, and are presented in Figure 14. Among these compounds, CID:91453, CID:10621161 and CID:11561907 demonstrated the greatest interaction fraction values of 0.9, 0.65 and 0.9, respectively, at amino acid residues GLU 269, GLU 327 and ASP 37. These interactions featured numerous bonds, suggesting strong binding. Compounds CID:10621161 and CID:91453 demonstrated the greatest numbers of water bridges and hydrogen bonds, indicating improved compound stability.

| DISCUSSION
Huntington's disease is an inherited neurological illness that often arises in adults in their forties and fifties. 25,69[74][75] Currently, omics-based methodologies are widely applied in biomedical and systems biology investigations, proving to be important tools in dissecting disease pathophysiology, uncovering molecular pathways and generating biomarkers for diverse illnesses. 34 Previous discoveries show that gene expression may be controlled during various phases of RNA processing, protein post-translational modifications (PTMs), translation, or genetic modifications. 76,77 Characterizing target protein activities in bioactive compounds is critical for defining the biochemical route of a certain illness and understanding the participation of basic processes in a particular phenotype. 34,78 Investigation into hub proteins has received interest, with protein-protein interactions (PPI) defined as either persistent or transient depending on their duration and function. 30,79 Networks built using PPI are considered scale-free, with node connectivity usually following a power-law distribution. 80 Integrating a network-based method with genomic data assists in detecting relationships between diverse biological processes and activities, leading to the identification of novel pathways, interaction networks and disease-related signals that help establish biomarkers and therapeutic targets. 81 While previous studies have examined miRNA expression levels in cellular as well as mouse models, 75 shared gene expression patterns in individuals with HD 82 and DNA methylation in HD, 72 there is a lack of comprehensive bioinformatics analysis that investigates molecular markers and pathways in both healthy persons and those with HD. To overcome this gap, a complete bioinformatics strategy was utilised in this work to disclose molecular markers and essential pathways for HD, offering an encompassing view.
Using the term "HD," we ran a search in the GEO database, extracting records containing mRNA expression profiles of Homo sapiens. After extensive study of the available literature, these datasets were separated into control and disease groups. 31 Employing a bioinformatics method, we evaluated DEGs across these groups, indicating substantial differences in gene expression among HD patients compared to neurologically healthy controls. Two datasets, GSE97100 and GSE74201, were selected, leading to the identification of 46 commonly down-regulated genes through systematic and statistical approaches (Figure 2).

TABLE 4 List of compound identity, the chemical name of selected best three ligands and the binding affinity of ligands with P29279-CCN2_HUMAN (Protein Accession: P29279) receptor and the thorough intermolecular interactions between them.

Gene ontology (GO) analysis was then undertaken to study the biological implications of these 46 DEGs linked with HD. The relevant studies have also identified GO terms related to signal transduction in the Biological Process 73,83 and plasma membrane in the Cellular Component. 74 In our analysis, we observed enrichment of molecular function related to TF activity and sequence-specific DNA binding by gene ontologies. Additionally, a previous study demonstrated the relevance of TF activity in molecular function 3,84 to HD. By merging historical data with our results, new treatment targets or putative pathogenic pathways for future investigation may be identified.
FIGURE 5 Top 10 Hub genes identified by Cytoscape.

FIGURE 6 The Network Analyst server was applied to visualize and forecast hub proteins within the Protein-Protein Interaction (PPI) Network. A protein-protein interaction (PPI) network was developed, emphasizing the top 10 hub genes and their interactions with additional Differentially Expressed Genes (DEGs). A confidence score of 500 was utilized in the construction of this network using the STRING interactome database. The visual representation depicts the top 10 hub genes as blue nodes, DEGs as green nodes and the degree of interactions among DEGs as blue edges. Larger nodes indicate hub proteins, while smaller nodes represent DEGs.

Subsequently, a protein interaction network concentrating on DEGs was constructed to reveal important hub proteins contributing to the genesis and progression of HD. These hub proteins are considered to have essential roles in the pathways causing the illness. 75 For example, TPM1 encodes tropomyosin, one of a well-conserved group of actin-binding proteins vital in different physiological processes. 85 Humans contain four tropomyosin genes that undergo alternative splicing, yielding different isoforms critical for muscle function and involved in many muscle-related diseases. 86 The EIF2S3 gene encodes the γ subunit of the eIF2 complex, crucial for initiating protein synthesis and regulating the stress response. 879][90][91] Additionally, CTGF/CCN2 is a matricellular protein of the CCN family, involved in different cellular processes such as cell proliferation, motility and ECM synthesis. Its fibrotic action has been widely investigated, notably in illnesses involving fibrosis such as Duchenne muscular dystrophy. 92,93 Alpha-actinin 1 helps tie the myofibrillar actin filaments to the Z-line and is a crucial player in muscle contraction. The sarcomeric Z-line acts by joining "titin and actin filaments from opposing sarcomere halves in a lattice connected by alpha-actinin." 94 Mammals, such as humans, have four α-actinin encoding genes (ACTN1, ACTN2, ACTN3 and ACTN4). ACTN1 was examined in this study. Actinins are crucial for muscular contraction, and disruption of their normal function may lead to muscle conditions such as hereditary inclusion body myopathy. 95 Previous investigations have shown alternatively spliced mRNAs of ACTN1. 96,97 There are two alternatively spliced isoforms; for this research, they include Titin-L and Titin-S. In addition, CCN1 has shown neuroprotective benefits in several neurodegenerative disorders.
For instance, Chen et al. 98 revealed that CCN1 protected against ischemia-induced neuronal damage in rats. Although this work did not explicitly address HD, it shows that CCN1 may have neuroprotective qualities that might be relevant to HD, where neurons are prone to injury and degeneration. 99 TCEAL5 has been linked to cellular differentiation processes. In neurodegenerative illnesses like HD, maintaining appropriate neuronal development and function is critical for reducing disease progression. Although there may not be direct evidence connecting TCEAL5 to HD, its involvement in cellular differentiation might alter neuronal integrity and function in the setting of HD. 103,104 The exploration of regulatory biomolecules as possible biomarkers for major diseases like neurodegenerative disorders is increasing. 29,34,105,106 We examined how TFs and miRNAs are involved in regulating DEGs via TF-DEG and DEG-miRNA interactions. 73,82 MicroRNAs play a vital role in gene expression control and show potential as biomarkers for HD and other illnesses. Several miRNAs are anticipated to have a role in the pathogenesis of HD. 2,107 Our research identified GATA2, E2F1, HINFP, TFAP2A, RELA, CREB1, FOXC1, NFKB1, YY1 and USF2 as the most significant transcription factors (TFs), while hsa-mir-1-3p, hsa-mir-1303, hsa-mir-26b-5p, hsa-mir-1277-5p, hsa-mir-133a-3p, hsa-mir-16-5p, hsa-mir-205-5p, hsa-mir-21-5p, hsa-mir-218-5p and hsa-mir-124-3p were identified as the top miRNAs implicated in HD (Figure 7). FOXC1, GATA2 and YY1 have been found to be regulatory TFs in several neurological conditions, such as Alzheimer's disease and various others. 71,105 One study observed an elevation in TFAP2A nucleoid signal concentration within specific micropattern colonies associated with HD. 108 Downregulation of hsa-miR-124-3p is involved in various neurological disorders such as Alzheimer's disease and HD in mice and humans.
109 Thus, abnormal expression of miR-124 has been detected in HD, ischemic stroke, Alzheimer's disease, Parkinson's disease and hypoxic-ischemic encephalopathy. 109,110 Besides, other miRNAs include hsa-miR-26b-5p, which is connected with neuronal differentiation, development and vitamin D metabolism. 111,112 In recent times, in silico drug design has emerged as a crucial technology in modern drug development, offering the potential to significantly decrease the cost, time and labour associated with the drug discovery process. 113,114 By enabling scientists to target their biological and synthetic research efforts more precisely than was previously feasible, it has aided in the development of novel medications by lowering costs and shortening research timeframes. 115,116 Nowadays, many researchers are using in silico drug design techniques since they can speed up the process of creating high-quality drugs. This is accomplished by computer-aided discovery of potential therapeutic compounds for various diseases, molecular docking, post-docking interaction analysis and molecular dynamics simulation (MDS). 113,114 Molecular docking is a method used to predict how molecules may interact optimally in terms of their structure and with the lowest possible binding energy.
The molecular docking method was initially used to select compounds based on which ones had the lowest binding energies. The docking outcome indicates that three of the 500 marine-derived compounds could be considered for further exploration as potential therapeutic candidates against HD by targeting protein and gene expression pathways. Three marine-derived compounds, namely cortistatin A, 13,16-Epoxy-25-hydroxy-17-cheilanthen-19,25-olide and hecogenin, exhibited the most favourable docking scores of −9 kcal/mol, −8.8 kcal/mol and −8.6 kcal/mol, respectively, when interacting with the protein P29279-CCN2_HUMAN (Protein Accession: P29279), suggesting considerable inhibitory potential against HD. Analysing the binding interactions, strong hydrophobic and hydrogen-bond interactions were found between the ligands and the protein (Figure 8). Furthermore, molecular dynamics simulations can be employed to validate the stability of a protein in complex with ligands. 117,118 Additionally, they can assess the flexibility and stability of complexes formed between proteins and ligands within a defined simulated environment, such as one mimicking the human body.
The RMSD values of the complex system indicate each compound's stability, whereas the RMSF values of the protein-ligand complex measure its per-residue flexibility and represent the average fluctuation.14,119 The minimal change in the protein structure is confirmed by the system's RMSD, which is calculated using the Cα atoms of the protein-ligand complexes. The protein's fluctuation was also computed using the RMSF value, which verifies the complex system's minimal fluctuation and indicates the compounds' stability with respect to the target protein. Many other metrics were examined to evaluate the complex's flexibility and stability, such as solvent-accessible surface area (SASA), radius of gyration (Rg), number of hydrogen bonds and molecular surface area (MoLSA). In this study, a 100-nanosecond Molecular Dynamics Simulation (MDS) was carried out using the necessary physicochemical and physiological parameters, employing the Desmond application of the Schrödinger software package. Except for minor permissible fluctuations, cortistatin A, 13,16-Epoxy-25-hydroxy-17-cheilanthen-19,25-olide and hecogenin displayed comparable RMSD and RMSF values with the protein P29279-CCN2_HUMAN (Protein Accession: P29279) (Figures 9 and 10). The simulation results for the other parameters, including Rg, hydrogen bond number, SASA and MoLSA, were favourable when simulating with proteins linked with the development of HD (Figures 11-14). This suggests the potential development of these compounds into medications for HD. Additionally, the Simulation Interaction Diagram (SID) was used in a 100-nanosecond simulation to examine the intermolecular interactions that occur between proteins when they are complexed with certain ligands.14,120,121 The results demonstrated that, throughout the simulation, all compounds formed numerous contacts via ionic bonds, hydrophobic interactions, hydrogen bonds and water bridges. Importantly, these contacts persisted until the end of the simulation, promoting stable binding with the targeted protein.
Our findings showed that three potential drug candidates (cortistatin A, 13,16-Epoxy-25-hydroxy-17-cheilanthen-19,25-olide and hecogenin) were identified based on low binding energy and adherence to Lipinski's rule of five. Molecular dynamics simulations validated these findings. This research aids in designing effective HD therapeutics, with possibilities for additional wet lab inquiry.

…,100 BEX2 has been implicated in enhancing neuronal differentiation and neurite outgrowth. In neurodegenerative illnesses such as HD, preserving neuronal integrity and encouraging neural plasticity are critical for minimising disease progression. BEX2's involvement in neuronal development might alter the survival and function of neurons damaged by HD.101,102

A total of 500 marine-derived compounds were obtained from various literature studies. They include haliclonacyclamine F, 13,16-Epoxy-25-hydroxy-17-cheilanthen-19,25-olide, Terreulactone C and many more. The structures were downloaded from the PubChem database in a structure file format suitable for docking analysis.

…docking was carried out to assess the effectiveness against HD of the marine-derived compounds obtained from the literature study, by evaluating the binding modes and orientations of the ligands in the receptor pocket of the P29279-CCN2_HUMAN (Protein Accession: P29279) target. The resulting docking result is displayed …

… CID:11561907, CID:10621161 and CID:91453 in combination with P29279-CCN2_HUMAN (Protein Accession: P29279) were computed. The aim was to discern alterations in protein flexibility caused by the binding of the different ligand molecules at specific residue sites of the protein, as illustrated in Figure 10. A change in RMSF greater than 5 Å is considered a tangible and significant change in residue-specific flexibility. The RMSF graph indicated low average values for the complexes of P29279-CCN2_HUMAN (Protein Accession: P29279) with CID:91453 (orange; 4-5 Å), CID:10621161 (yellow; 3.3-4.5 Å) and CID:11561907 (green; 3.1-4.8 Å), with slight fluctuations between residue indices 5 and 50 and between 170 and 200, indicating that the natural compounds remained strongly bound to the P29279-CCN2_HUMAN (Protein Accession: P29279) protein in terms of their average positions (Figure 10).

TABLE 1 A statistical summary of gene expression in the datasets utilized for analysis.

Gene set enrichment analysis of the genes that exhibited differential expression in microarray data from people with Huntington's disease (HD). The top 10 enriched gene ontology (GO) terms are aggregated in a table.

FIGURE 4 Significantly enriched (A) KEGG pathways and (B) Reactome pathways.

FIGURE 6 Protein-protein interaction (PPI) network developed using network analysis, with an emphasis on protein hubs. The top 10 hub genes identified with Cytoscape (v3.10.1) are presented, together with the relationships among them, calculated using both NetworkAnalyst and Cytoscape (v3.10.1). The PPI network was examined to find protein hubs among the DEGs, which comprised 46 common genes. The hub proteins found were TPM1, EIF2S3, CCN2, ACTN1, ACTG2, CCN1, CSRP1, EIF1AX, BEX2 and TCEAL5. These hub genes may act as crucial indicators of the course of HD.

FIGURE 7 The molecules identified as the transcriptomic signature from the NetworkAnalyst server, visualized in two parts. (A) The top 10 transcription factors (TFs) linked to differentially expressed genes (DEGs). Red nodes signify TFs, ash-coloured nodes represent DEGs and the edges indicate their interactions. (B) The top 10 microRNAs (miRNAs) associated with DEGs. Brown square nodes represent miRNAs, red nodes represent DEGs and the edges indicate their interactions.

FIGURE 9 RMSD values of the complex structures, derived from Cα atoms, shown as a line graph for CID:11561907 (green), CID:10621161 (yellow) and CID:91453 (orange).

FIGURE 10 RMSF values derived from the Cα atoms of the protein residues in the complex structures for CID:11561907 (green), CID:10621161 (yellow) and CID:91453 (orange).

FIGURE 11 Radius of gyration of the three selected compounds, CID:91453 (orange), CID:10621161 (yellow) and CID:11561907 (green), in complex with the P29279-CCN2_HUMAN (Protein Accession: P29279) protein.

FIGURE 12 Solvent-accessible surface area (SASA) of the protein-ligand complexes, determined from the 100 ns simulation, for the three selected ligands CID:91453 (orange), CID:10621161 (yellow) and CID:11561907 (green) in association with the selected protein.

FIGURE 13 Molecular surface area (MoLSA) of the protein-ligand complexes, computed from the 100 ns simulation, for the three selected ligands CID:91453 (orange), CID:10621161 (yellow) and CID:11561907 (green) in association with the selected protein.

FIGURE 14 Stacked bar charts of the protein-ligand interactions found during the 100 ns simulation for the three selected compounds, (A) CID:91453, (B) CID:10621161 and (C) CID:11561907, in complex with P29279-CCN2_HUMAN (Protein Accession: P29279).

5 | CONCLUSION
This study aimed to identify important biomolecules and associated biochemical pathways using integrative bioinformatics analysis. Out of a total of 1743 DEGs, ten DEGs, namely TPM1, EIF2S3, CCN2, ACTN1, ACTG2, CCN1, CSRP1, EIF1AX, BEX2 and TCEAL5, were identified as hub-DEGs that presumably play critical roles in HD development. Enrichment analysis of these DEGs through the gene ontology (GO) database unveiled significant functions such as Diphosphotransferase Activity, alpha-N-acetylgalactosaminide Alpha-2,6-Sialyltransferase Activity, Purine Ribonucleoside Triphosphate Binding, Protein Homodimerization Activity, ATP Binding, LRR Domain Binding, Ubiquitination-Like Modification-Dependent Protein Binding, Adenyl Ribonucleotide Binding, Translation Initiation Factor Activity and Mannosyl-Oligosaccharide 1,2-Alpha-Mannosidase Activity. Potential regulatory biomarkers for both DEGs and hub-DEGs, including predicted regulatory TFs and miRNAs (such as hsa-miR-124-3p and hsa-miR-26b-5p), were found. By employing molecular docking alongside cross-validation, we found three top-ranked potential drug candidates (cortistatin A, 13,16-Epoxy-25-hydroxy-17-cheilanthen-19,25-olide and hecogenin) that had the lowest binding energies and complied with Lipinski's rule of five. These results were supported by molecular dynamics (MD) simulations and by the relevant literature. Consequently, the outcomes of this research might help greatly in designing an effective therapeutic approach for HD. Future wet lab investigations may build on these results.

AUTHOR CONTRIBUTIONS
Md Ridoy Hossain: Conceptualization (equal); data curation (equal); formal analysis (equal); investigation (equal); methodology (equal); resources (equal); software (equal); validation (equal); visualization (equal); writing – original draft (equal). Md. Mohaimenul Islam Tareq: Data curation (equal); formal analysis (equal); investigation (equal); methodology (equal); resources (equal); software (equal); validation (equal); visualization (equal); writing – original draft (equal). Partha Biswas: Conceptualization (equal); data curation (equal); formal analysis (equal); investigation (equal); methodology (equal); resources (equal); software (equal); validation (equal); visualization (equal); writing – original draft (equal). Sadia Jannat Tauhida: Data curation (equal); formal analysis (equal); investigation (equal); methodology